Airbnb has become an increasingly popular way for travelers to find accommodations around the world. With a wide range of options available, from budget-friendly shared rooms to luxurious villas, Airbnb has something for everyone.
This data analysis report aims to identify the major differences in the Airbnb market between cities and the attributes that have the biggest influence on price.
To achieve this, we will collect data from Airbnb listings across multiple cities and perform an exploratory data analysis to identify patterns and trends in the data.
We will investigate how factors such as location, property type, amenities, and availability affect the price of Airbnb rentals in different cities. Additionally, we will identify the attributes that have the most influence on price.
Finally, we will provide insights and recommendations to help Airbnb hosts and potential guests optimize their listings and bookings, and to inform policymakers and industry stakeholders about the state of the Airbnb market across different cities.
The packages required to run the code in this study are the following:
library(readxl) # To read in the data
library(tidyverse) # For general data manipulation and regression analysis
library(knitr) # To format tables
library(DAAG) # To provide a collection of functions and data sets (cv.lm is used below)
library(ggplot2) # To visualize data
library(psych) # To produce the most frequently requested descriptive statistics
library(readr) # To read and parse structured data files
library(dplyr) # To manipulate and transform data
library(GGally) # To create complex and multivariate visualizations
#library(qwraps2) # To make summary tables
#library(naniar) # To replace missing values
#library(formattable) # To format tables into currency
#library(MASS) # To calculate VIF
Our data sets are the Listings and Reviews data sets included in the Airbnb Listings & Reviews collection. All of the Airbnb data shows <add more here> for 2,469 (add correct) listings over a span of 13 years. We will read all of this data into R.
listings <- read.csv("C:/Users/sneha/Downloads/Airbnb+Data/Airbnb Data/Listings.csv")
listings
reviews <- read.csv('C:/Users/sneha/Downloads/Airbnb+Data/Airbnb Data/Reviews.csv')
reviews
room_type <- c("Entire place", "Hotel room", "Private room", "Shared room")
room_type_encoded <- factor(room_type, levels = unique(room_type))
as.integer(room_type_encoded)
## [1] 1 2 3 4
#listing_new
The first part of data cleaning involves removing the columns we do not need from the data sets. To keep the data tidy, we will simplify each data set to include only the variables we want to analyze.
To focus on the price segment of interest, we filter the listings data frame to keep only the listings priced above 5000 and below 30000.
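As a small illustration of how that price filter behaves at its boundaries, here is a sketch on a made-up data frame. The toy data and its prices are hypothetical; base R's subset() stands in for dplyr's filter() so the example runs without extra packages.

```r
# Toy data: the prices are invented for illustration only
toy <- data.frame(listing_id = 1:5,
                  price = c(4000, 5000, 5001, 29999, 30000))

# Strict inequalities: 5000 and 30000 themselves are excluded
kept <- subset(toy, price > 5000 & price < 30000)
print(kept)
```

Only the listings priced 5001 and 29999 survive, which is consistent with the minimum price of 5001 reported in the summary statistics later in this report.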
listing_price <- filter(listings, price > 5000 & price < 30000)
listing_price
room_type <- c("Entire place", "Hotel room", "Private room", "Shared room")
room_type_encoded <- factor(room_type, levels = unique(room_type))
as.integer(room_type_encoded)
## [1] 1 2 3 4
listing_new <- listing_price %>%
  subset(select = -c(host_id, host_since, host_location, host_total_listings_count, host_has_profile_pic, host_identity_verified, latitude, longitude, district, name))
reviews_new <- reviews %>%
  subset(select = -c(review_id, reviewer_id))
Here, we remove the rows with missing values to finish cleaning the data set.
# Drop every row that contains a missing value
listing_new <- na.omit(listing_new)
listing_new
Counting the missing values in each column confirms that all of them have been taken care of.
colSums(is.na(listing_new))
##                  listing_id          host_response_time
##                           0                           0
##          host_response_rate        host_acceptance_rate
##                           0                           0
##           host_is_superhost               neighbourhood
##                           0                           0
##                        city               property_type
##                           0                           0
##                   room_type                accommodates
##                           0                           0
##                    bedrooms                   amenities
##                           0                           0
##                       price              minimum_nights
##                           0                           0
##              maximum_nights        review_scores_rating
##                           0                           0
##      review_scores_accuracy   review_scores_cleanliness
##                           0                           0
##       review_scores_checkin review_scores_communication
##                           0                           0
##      review_scores_location         review_scores_value
##                           0                           0
##            instant_bookable
##                           0
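The missing-value check above can be sketched on a small made-up data frame. The toy data is hypothetical and only illustrates how na.omit() and colSums(is.na()) interact.

```r
# Toy data with one missing value in each column (values are invented)
toy <- data.frame(price = c(6000, NA, 7500, 9000),
                  bedrooms = c(2, 3, NA, 1),
                  city = c("Paris", "Rome", "Paris", NA))

na_before <- colSums(is.na(toy))   # missing values per column before cleaning
clean <- na.omit(toy)              # drops every row with at least one NA
na_after <- colSums(is.na(clean))  # should be zero in every column

print(na_before)
print(clean)
print(na_after)
```

Note that na.omit() removes a row if any single column is missing, so three of the four toy rows are dropped. Also, is.na() does not flag empty strings, so blank text fields read in by read.csv() would pass this check unless they are converted to NA first.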
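We will introduce a new variable called Total_amenities in the listings data set to help us better understand the price trends. It estimates the number of amenities in each listing by counting the commas in the amenities field and adding one.
We also have other existing key variables such as city, room_type, price, and amenities.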
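The comma-counting idea behind Total_amenities can be checked on a few hand-written strings. This sketch uses base R rather than str_count() so that it runs without extra packages; the amenity strings and the count_items() helper are hypothetical, and the bracketed list format is an assumption about how the amenities column is stored.

```r
# Hypothetical helper: number of commas in each string, plus one
count_items <- function(x) {
  commas <- lengths(regmatches(x, gregexpr(",", x, fixed = TRUE)))
  commas + 1
}

# Invented examples in an assumed bracketed-list format
amenities <- c('["Wifi", "Kitchen", "Heating"]', '["Wifi"]', '[]')
counts <- count_items(amenities)
print(counts)
```

The last example shows a limitation of the approach: an empty list still counts as one amenity, and an amenity whose name contains a comma would be counted twice.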
listing_new <- listing_new %>% mutate(Total_amenities = str_count(amenities, ",") + 1)
listing_new
Data Dictionary
This dataset includes the following variables:
listing_dict <- read.csv('C:/Users/sneha/Downloads/Airbnb+Data/Airbnb Data/Listings_data_dictionary.csv')
listing_dict
listing_new
The data captured covers listings priced above 5000. There are 1104 observations in the data set, spread over 9 cities.
The characteristics captured for each listing are repeated below, and explained in the prior tab:
colnames(listing_new)
##  [1] "listing_id"             "host_response_time"
##  [3] "host_response_rate"     "host_acceptance_rate"
##  [5] "host_is_superhost"      "neighbourhood"
##  [7] "city"                   "property_type"
##  [9] "room_type"              "accommodates"
## [11] "bedrooms"               "amenities"
## [13] "price"                  "minimum_nights"
## [15] "maximum_nights"         "review_scores_rating"
## [17] "review_scores_accuracy" "review_scores_cleanliness"
## [19] "review_scores_checkin"  "review_scores_communication"
## [21] "review_scores_location" "review_scores_value"
## [23] "instant_bookable"       "Total_amenities"
Below are some summary statistics of the listing price and the review scores rating:
summary(listing_new$price)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    5001    6143    7880    9037   10000   29500
summary(listing_new$review_scores_rating)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   20.00   95.00   99.00   96.09  100.00  100.00
The average listing price is 9,037, and prices in the filtered data range from 5,001 to 29,500.
Review score ratings are concentrated near the top of the scale: the median is 99 and the mean is 96.09, although the lowest rating is 20.
The visualizations below show the spread of the data for the key variables.
x1 <- listing_new$Total_amenities
y <- listing_new$price
ggplot(data.frame(x1, y), aes(x = x1, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Total_amenities", y = "Price", title = "Plot with Price starting from 5000" )
The next plot shows the relationship between the type of room (room_type) and price. The ylim() function is used to focus on prices above 5000, while the labs() function adds descriptive labels to the plot.
Because room_type is a categorical variable (e.g., "Entire place", "Private room", "Shared room"), the plot displays points at discrete locations on the x-axis, one for each category.
x2 <- listing_new$room_type
y <- listing_new$price
ggplot(data.frame(x2, y), aes(x = x2, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Room Type", y = "Price", title = "Plot with Price starting from 5000")
The next plot shows the relationship between the review score rating (review_scores_rating) and price. The ylim() function is used to focus on prices above 5000, while the labs() function adds descriptive labels to the plot.
Because review_scores_rating is a numerical score, the plot displays points across a continuous range of values on the x-axis.
x3 <- listing_new$review_scores_rating
y <- listing_new$price
ggplot(data.frame(x3, y), aes(x = x3, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Review Score Rating", y = "Price", title = "Plot with Price starting from 5000")
The next plot shows the relationship between the city and price. The ylim() function is used to focus on prices above 5000, while the labs() function adds descriptive labels to the plot.
Because city is a categorical variable, the plot displays points at discrete locations on the x-axis, one for each city.
x <- listing_new$city
y <- listing_new$price
ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "City", y = "Price", title = "Plot with Price starting from 5000")
The next plot shows the relationship between the instant_bookable variable and price. The ylim() function is used to focus on prices above 5000, while the labs() function adds descriptive labels to the plot.
Because instant_bookable is a categorical variable (e.g., "True" or "False"), the plot displays points at discrete locations on the x-axis, one for each category.
x <- listing_new$instant_bookable
y <- listing_new$price
ggplot(data.frame(x, y), aes(x = x, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Instant Bookable", y = "Price", title = "Plot with price starting from 5000")
The next plot shows the relationship between the bedrooms variable and price. The ylim() function is used to focus on prices above 5000, while the labs() function adds descriptive labels to the plot.
Because bedrooms is a numerical count, the plot displays points along a numeric x-axis, at the whole-number bedroom counts that occur in the data.
x5 <- listing_new$bedrooms
y <- listing_new$price
ggplot(data.frame(x5, y), aes(x = x5, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Bedrooms", y = "Price", title = "Plot with price starting from 5000")
The next plot shows the relationship between the minimum_nights variable and price. The xlim() function is used to focus on minimum_nights values between 0 and 25, while the ylim() function is used to focus on prices above 5000. Points outside these limits are dropped from the plot, which is why ggplot2 warns about removed rows. The labs() function adds descriptive labels to the plot.
Because minimum_nights is a numerical count, the plot displays points along a numeric x-axis.
x6 <- listing_new$minimum_nights
y <- listing_new$price
ggplot(data.frame(x6, y), aes(x = x6, y = y)) + geom_point(color="steelblue") + xlim(0, 25) + ylim(5000, max(y)) + labs(x = "Minimum Nights", y = "Price", title = "Plot with price starting from 5000")
## Warning: Removed 20 rows containing missing values (`geom_point()`).
The next plot shows the relationship between the host_response_rate variable and price. The ylim() function is used to focus on prices above 5000. The labs() function adds descriptive labels to the plot.
Note that host_response_rate should be a continuous variable (e.g., a percentage) for this plot to make sense. If it were stored as a categorical variable, the plot would display points at discrete locations on the x-axis, one for each category.
x7 <- listing_new$host_response_rate
y <- listing_new$price
ggplot(data.frame(x7, y), aes(x = x7, y = y)) + geom_point(color="steelblue") + ylim(5000, max(y)) + labs(x = "Host Response Rate", y = "Price", title = "Plot with Price starting from 5000")
The listing prices plotted against their row index:
plot(listing_new$price , col = "steelblue", lwd = -0.5)
The airbnb_agg data frame created below has one row for each unique combination of city and room_type, with columns for the mean price and mean minimum_nights of each group. The column names match the variable names used inside cbind() (i.e., price and minimum_nights).
Note that placing cbind() on the left-hand side of the formula lets aggregate() summarize price and minimum_nights in a single call. Without cbind(), we would have to call aggregate() once for each variable and merge the results, which is tedious and error-prone for larger data frames with many variables.
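To see the shape of such a result on a small scale, here is a sketch on a made-up data frame. The toy data and its values are hypothetical and are not taken from the Airbnb data.

```r
# Toy listings (values are invented for illustration)
toy <- data.frame(
  city = c("Paris", "Paris", "Rome", "Rome", "Rome"),
  room_type = c("Entire place", "Entire place", "Private room",
                "Private room", "Entire place"),
  price = c(6000, 8000, 5500, 6500, 9000),
  minimum_nights = c(2, 4, 1, 3, 5)
)

# cbind() on the left-hand side summarizes both variables in one call
toy_agg <- aggregate(cbind(price, minimum_nights) ~ city + room_type,
                     data = toy, FUN = mean)
print(toy_agg)
```

The result has one row per city and room_type combination that actually occurs in the data (three here), with the group means stored in columns named price and minimum_nights.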
airbnb_agg <- aggregate(cbind(price, minimum_nights) ~ city + room_type, data = listing_new, FUN = mean)
airbnb_agg
The plot below shows the mean price for each unique combination of city and room_type, with different colors distinguishing the room_type values. The x-axis displays the city, and the y-axis displays the mean price. The main title is “Price as per Room Type”, and the subtitle is “Price upto 50000”.
ggplot(airbnb_agg, aes(x = city, y = price, color = room_type)) +
  geom_point() +
  labs(title = "Price as per Room Type",
       subtitle = "Price upto 50000",
       x = "city",
       y = "price",
       color = "room type")
listings_fit2 <- listing_new[c(3,11,13,14,16,24)]
listings_fit2
pairs(listings_fit2, col="steelblue", pch = 18,
      labels = c("host_response_rate", "bedrooms", "price","minimum_nights","review_scores_rating","Total_amenities"),
      main = "This is a pairs plot in R")
In the table below, the “city” column contains the unique values of the city variable in the listing_new data set, and the “price” column contains the mean price of listings for each city.
average_price_by_city <- aggregate(price ~ city, data = listing_new, FUN = mean)
average_price_by_city
barplot(average_price_by_city$price, names.arg = average_price_by_city$city,las=2 , ylab = "Average Price", main = "Average Price by City", col = "steelblue")
# Mean price by number of amenities (despite the variable name)
review_rating_by_city <- aggregate(price ~ Total_amenities, data = listing_new, FUN = mean)
review_rating_by_city
avg_price_by_room_type <- aggregate(price ~ room_type, data = listing_new, FUN = mean)
barplot(avg_price_by_room_type$price, names.arg = avg_price_by_room_type$room_type, xlab = "Room Type", ylab = "Average Price", main = "Average Price by Room Type")
plot(listing_new$minimum_nights, listing_new$price, xlab = "Minimum Nights", ylab = "Price", main = "Price vs. Minimum Nights", col = "steelblue")
The process of determining an appropriate model to represent Airbnb listing prices is two-fold. First, we need to perform a residual analysis. Through this analysis, we can ensure that the full model (that is, all the possible covariates along with the response variable) meets the following criteria for linear regression:
The relationship between the regressors and the response variable is approximately linear
Errors are independent
Errors are normally distributed
Error term has an equal/constant variance
If any of these assumptions is not met, variables in the model must be transformed and rechecked in a process called Model Adequacy Checking, which involves Transformation and Residual Analysis.
Let’s begin with a full model. We create the data set on which to fit the model, shown below. Note that we are removing amenities, accommodates, neighbourhood, city, and several of the review scores from this data set, as they are not covariates we want to include in our model. Amenities and neighbourhood are not generalizable.
listings_fit <- listing_new[c(3,9,11,13,14, 16,24)]
The residuals function returns the residuals (i.e., the difference between the observed values and the predicted values) from the linear regression model.
In this case, the residuals are the difference between the actual price of each Airbnb listing and the predicted price based on the values of the predictors in the model.
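The definition of a residual can be verified directly on a small simulated example. The data below is generated at random purely for illustration; the variable names mirror the report, but the numbers are not from the Airbnb data.

```r
set.seed(1)

# Simulated data: price rises with bedrooms, plus random noise
bedrooms <- rep(1:5, each = 4)
price <- 6000 + 600 * bedrooms + rnorm(20, sd = 500)
toy <- data.frame(price, bedrooms)

fit <- lm(price ~ bedrooms, data = toy)

# Residual = observed value minus fitted (predicted) value
manual_resid <- toy$price - fitted(fit)
same <- all.equal(unname(manual_resid), unname(residuals(fit)))
print(same)
```

Because the model includes an intercept, the residuals also sum to zero up to floating-point error, which is why the residual plot is centered on the horizontal line at zero.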
listings_model <- lm(price ~ ., data = listings_fit)
Each point on the plot below represents the difference between the actual value and the predicted value of the dependent variable price for a particular observation in the data set. The pch=20 argument draws the points as small filled circles. The abline function adds a horizontal line at y=0 to make it easier to see the distribution of the residuals around zero.
plot(listings_model$fitted.values,listings_model$residuals,pch=20)
abline(h=0,col="grey")
In the normal Q-Q plot below, the residuals appear to be fairly normally distributed, except for some deviations from the line in the tails. This suggests that the model is a reasonable fit to the data, but there may be some outliers or heavy tails that are not captured by the model.
qqnorm(listings_model$residuals,main="listings_model")
qqline(listings_model$residuals)
We have satisfied the linear regression assumptions by transforming the variables accordingly. Now we can move on to selecting variables from this full model, to determine the combination of regressors that best models their relationship with the final listing price.
Forward Selection + Backward Elimination
When attempting to use the step function, we kept getting a message that says “attempting model selection on an essentially perfect fit is nonsense”, even though testing gave a predicted R squared of around 50%. Our hypothesis is that this is due to varying levels of certain amenities. Therefore, it is more practical to select variables manually through forward selection and backward elimination.
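One manual backward-elimination step with drop1() can be sketched as follows. The data is simulated, and the noise_a and noise_b predictors are hypothetical placeholders for uninformative variables; this shows the general technique rather than the exact selection path followed in this report.

```r
set.seed(42)
n <- 200

# Simulated data: only bedrooms truly affects price
toy <- data.frame(
  bedrooms = sample(1:6, n, replace = TRUE),
  noise_a = rnorm(n),
  noise_b = rnorm(n)
)
toy$price <- 6000 + 700 * toy$bedrooms + rnorm(n, sd = 1000)

full <- lm(price ~ bedrooms + noise_a + noise_b, data = toy)

# F test for dropping each term in turn from the full model
tab <- drop1(full, test = "F")
print(tab)

# The term with the largest p-value is the candidate to remove
pvals <- tab[["Pr(>F)"]]          # the first entry (<none>) is NA
names(pvals) <- rownames(tab)
weakest <- names(which.max(pvals))

# Refit without that term; repeat until every remaining term is significant
reduced <- update(full, as.formula(paste(". ~ . -", weakest)))
print(formula(reduced))
```

Each pass removes at most one term, so the table is recomputed after every removal, since p-values can change once a correlated predictor leaves the model.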
drop1(lm(price ~ x1 + x2 + x3 + x5 + x6 + x7,data=listings_fit), test="F")
We found the best model to be:
drop1(lm(price ~ x3 + x5 + x6 + x7,data=listings_fit), test="F")
drop1(lm(price ~ x5 + x7,data=listings_fit), test="F")
The Final Model to be Validated:
finalmodel = lm(price ~ bedrooms + host_response_rate, data=listings_fit)
summary(finalmodel)
##
## Call:
## lm(formula = price ~ bedrooms + host_response_rate, data = listings_fit)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##  -8671  -2589  -1174   1199  20605
##
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)
## (Intercept)          7708.1      680.4  11.329   <2e-16 ***
## bedrooms              635.7       70.1   9.068   <2e-16 ***
## host_response_rate  -1355.6      675.1  -2.008   0.0449 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4074 on 1065 degrees of freedom
## Multiple R-squared: 0.0731, Adjusted R-squared: 0.07136
## F-statistic: 42 on 2 and 1065 DF, p-value: < 2.2e-16
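The adjusted R-squared and F-statistic in this output can be recovered from the printed multiple R-squared and degrees of freedom. The sketch below uses only the rounded values shown in the summary, so small rounding differences are expected.

```r
r2 <- 0.0731        # Multiple R-squared, as printed
resid_df <- 1065    # residual degrees of freedom, as printed
p <- 2              # predictors: bedrooms and host_response_rate
n <- resid_df + p + 1   # number of observations used in the fit

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Overall F-statistic expressed through R-squared
f_stat <- (r2 / p) / ((1 - r2) / resid_df)

print(n)
print(round(adj_r2, 5))
print(round(f_stat))
```

The computed values agree with the reported 0.07136 and 42. In practical terms, the two predictors are statistically significant but explain only about 7% of the variation in price.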
finalmodel1 = lm(price ~ . , data=listings_fit)
summary(finalmodel1)$r.squared
## [1] 0.07671024
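Here we perform 3-fold cross-validation with cv.lm() from the DAAG package to validate the model. Note that the call below cross-validates finalmodel1, the model with all predictors in listings_fit.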
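The idea behind k-fold cross-validation can be sketched in base R without DAAG. The data is simulated and the fold assignment is an illustrative choice; this is a minimal hand-rolled version, not a reproduction of how cv.lm() works internally.

```r
set.seed(123)
n <- 90

# Simulated data (values are invented for illustration)
toy <- data.frame(bedrooms = sample(1:6, n, replace = TRUE))
toy$price <- 6000 + 650 * toy$bedrooms + rnorm(n, sd = 1500)

k <- 3
fold <- sample(rep(1:k, length.out = n))   # random fold label for each row

cv_resid <- numeric(n)
for (i in 1:k) {
  test_rows <- fold == i
  # Fit on the other folds, then predict the held-out fold
  fit <- lm(price ~ bedrooms, data = toy[!test_rows, ])
  cv_pred <- predict(fit, newdata = toy[test_rows, ])
  cv_resid[test_rows] <- toy$price[test_rows] - cv_pred
}

cv_mse <- mean(cv_resid^2)   # cross-validated mean squared error
print(cv_mse)
```

The cv_pred and cv_resid values here play the same role as the cvpred and CV residual rows in the cv.lm() output: each CV residual is the observed price minus the prediction from a model that never saw that observation.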
KCV=cv.lm(data=listings_fit, finalmodel1, m=3, seed=123)
## Warning in cv.lm(data = listings_fit, finalmodel1, m = 3, seed = 123):
##
## As there is >1 explanatory variable, cross-validation
## predicted values for a fold are not a linear function
## of corresponding overall predicted values. Lines that
## are shown for the different folds are approximate
##
## fold 1
## Observations in test set: 356
## 114 118 128 138 139 140
## Predicted 8954.794 7506.005 8498.680 7366.870 8438.880 9031.452
## cvpred 9172.750 7504.400 8915.677 7288.207 8489.888 8954.982
## price 7153.000 9441.000 5244.000 12000.000 5162.000 6300.000
## CV residual -2019.750 1936.600 -3671.677 4711.793 -3327.888 -2654.982
## 143 147 179 185 187 188
## Predicted 10383.114 9051.819 8761.474 9173.5025 8790.5846 8769.483
## cvpred 10399.048 9102.600 8705.297 9305.4084 8893.2909 8833.620
## price 14455.000 5750.000 12390.000 8559.0000 9550.0000 8247.000
## CV residual 4055.952 -3352.600 3684.703 -746.4084 656.7091 -586.620
## 193 194 200 213 215 216
## Predicted 8503.174 8269.8992 9477.392 8594.6090 8645.9148 8609.366
## cvpred 8645.404 8349.8788 9329.587 8545.2145 8619.5513 8563.363
## price 6999.000 8655.0000 8000.000 8000.0000 9500.0000 9808.000
## CV residual -1646.404 305.1212 -1329.587 -545.2145 880.4487 1244.637
## 217 218 221 222 295 297
## Predicted 7618.226 7632.983 7826.585 8422.681 8845.839 8882.988
## cvpred 7651.144 7669.293 8025.260 8219.345 8982.555 8983.879
## price 5080.000 5906.000 5895.000 6300.000 6143.000 6132.000
## CV residual -2571.144 -1763.293 -2130.260 -1919.345 -2839.555 -2851.879
## 312 314 331 333 334 335
## Predicted 8818.394 8126.874 8250.677 8398.0180 8331.677 8321.704
## cvpred 8881.361 8128.454 8370.030 8365.8562 8547.128 8547.302
## price 10000.000 6377.000 6195.000 9000.0000 6491.000 5571.000
## CV residual 1118.639 -1751.454 -2175.030 634.1438 -2056.128 -2976.302
## 337 360 361 362 373 380 394
## Predicted 7542.419 7958.7980 7562.186 7816.779 6743.727 7000.722 8933.003
## cvpred 7697.101 8453.9142 7630.290 8303.078 6673.916 7127.250 9134.711
## price 5043.000 8175.0000 6121.000 9800.000 8614.000 10219.000 5264.000
## CV residual -2654.101 -278.9142 -1509.290 1496.922 1940.084 3091.750 -3870.711
## 396 399 402 403 405 411
## Predicted 9552.027 9628.618 8875.112 8269.018 12768.875 9459.8087
## cvpred 9720.379 9927.573 8833.920 8516.228 12955.191 9450.2051
## price 6204.000 8253.000 5995.000 5110.000 21830.000 9262.0000
## CV residual -3516.379 -1674.573 -2838.920 -3406.228 8874.809 -188.2051
## 417 418 419 433 434 435
## Predicted 10170.270 8744.05149 7729.805 8461.321 14283.940 8198.17
## cvpred 10294.447 9048.45783 7595.173 8544.869 14168.271 8405.42
## price 11968.000 9026.00000 5200.000 12000.000 5309.000 6000.00
## CV residual 1673.553 -22.45783 -2395.173 3455.131 -8859.271 -2405.42
## 439 443 444 452 495 499
## Predicted 9528.99 8881.832 9564.159 9489.7267 9554.186 8249.3865
## cvpred 9589.09 8923.861 9688.541 9449.6839 9688.715 8446.5255
## price 12500.00 5007.000 16000.000 9000.0000 19308.000 8000.0000
## CV residual 2910.91 -3916.861 6311.459 -449.6839 9619.285 -446.5255
## 503 504 511 514 521 526
## Predicted 8097.5387 9388.405 9849.927 8892.960 11006.496 8905.8721
## cvpred 8158.6198 9597.237 9831.846 8983.705 11366.646 8963.4675
## price 7500.0000 8000.000 6131.000 10429.000 8000.000 8001.0000
## CV residual -658.6198 -1597.237 -3700.846 1445.295 -3366.646 -962.4675
## 536 538 540 551 555 565
## Predicted 9307.1186 8839.6297 9609.300 8427.5233 8312.0449 9033.6348
## cvpred 9404.9528 8804.5187 9787.819 8775.6058 8477.4248 9381.5116
## price 10000.0000 8500.0000 12420.000 9500.0000 8000.0000 9995.0000
## CV residual 595.0472 -304.5187 2632.181 724.3942 -477.4248 613.4884
## 567 568 575 576 581 583
## Predicted 8718.632 9476.815 9687.82656 8959.848 8969.1756 9597.858
## cvpred 8679.392 9469.922 10066.63122 9290.768 9142.4803 9798.025
## price 7257.000 16429.000 10000.00000 10506.000 9714.0000 5286.000
## CV residual -1422.392 6959.078 -66.63122 1215.232 571.5197 -4512.025
## 584 590 592 593 597 600
## Predicted 9502.97 8972.394 9672.379 9590.824 9483.849 8833.331
## cvpred 9647.61 8972.621 10070.114 9778.135 9489.812 8965.973
## price 7457.00 15054.000 11760.000 6561.000 6286.000 5575.000
## CV residual -2190.61 6081.379 1689.886 -3217.135 -3203.812 -3390.973
## 603 612 613 617 626 633
## Predicted 8957.733 9708.238 9036.5738 9595.6983 8964.7671 9554.365
## cvpred 9152.686 10147.934 9361.4476 9829.6889 9172.5764 9755.178
## price 8045.000 9787.000 9025.0000 9077.0000 9018.0000 8708.000
## CV residual -1107.686 -360.934 -336.4476 -752.6889 -154.5764 -1047.178
## 692 693 694 695 696 698
## Predicted 9036.574 8230.579 8868.410 10665.849 10215.501 8833.931
## cvpred 9361.448 8218.677 9032.193 10771.466 10426.957 8911.110
## price 8003.000 5885.000 6700.000 5148.000 5817.000 5500.000
## CV residual -1358.448 -2333.677 -2332.193 -5623.466 -4609.957 -3411.110
## 699 706 708 713 717 719
## Predicted 8825.517 8686.492 8926.9733 8939.526 8214.9078 8846.663
## cvpred 8934.483 8593.108 9023.1384 9242.696 8325.4421 8824.409
## price 7000.000 5800.000 8786.0000 5790.000 7400.0000 9000.000
## CV residual -1934.483 -2793.108 -237.1384 -3452.696 -925.4421 175.591
## 722 726 731 735 739 745
## Predicted 8371.254 8926.973 9640.060 8983.243 8293.8827 8565.2587
## cvpred 8616.483 9023.138 9917.367 9182.261 8397.6898 9193.4762
## price 7607.000 28571.000 10141.000 19520.000 8060.0000 8900.0000
## CV residual -1009.483 19547.862 223.633 10337.739 -337.6898 -293.4762
## 747 778 779 780 782 785 789
## Predicted 8762.124 8265.434 8194.407 8314.670 8399.075 7814.53 7532.492
## cvpred 8722.239 8388.179 8210.908 8527.411 8766.095 8301.51 7527.530
## price 12800.000 8446.000 7000.000 6142.000 10058.000 5386.00 5529.000
## CV residual 4077.761 57.821 -1210.908 -2385.411 1291.905 -2915.51 -1998.530
## 790 793 794 795 796 797
## Predicted 8919.62587 8181.030 8974.516 8987.893 10115.470 8145.861
## cvpred 9073.29859 8149.495 9275.685 9337.097 10125.293 8050.044
## price 9000.00000 12500.000 6000.000 7846.000 8714.000 6500.000
## CV residual -73.29859 4350.505 -3275.685 -1491.097 -1411.293 -1550.044
## 798 799 804 887 890 894
## Predicted 9552.716 9011.064 10759.868 9674.9371 9548.207 8089.238
## cvpred 9698.747 9331.873 10877.049 9553.3977 9617.090 8114.423
## price 5900.000 6500.000 12000.000 10000.0000 5057.000 5219.000
## CV residual -3798.747 -2831.873 1122.951 446.6023 -4560.090 -2895.423
## 902 913 915 919 921 924
## Predicted 8827.700 11004.938 11694.74 9663.853 9185.236 8502.909
## cvpred 8802.018 11311.589 11747.36 9602.062 9583.798 8819.118
## price 7199.000 6743.000 23705.00 13268.000 7429.000 7000.000
## CV residual -1603.018 -4568.589 11957.64 3665.938 -2154.798 -1819.118
## 929 931 933 934 936 938
## Predicted 10321.069 10065.056 9218.44179 8546.840 9823.8236 9229.167
## cvpred 10585.936 9994.986 9439.02462 8513.163 9847.2915 9277.843
## price 5900.000 6500.000 9429.00000 6600.000 9950.0000 7900.000
## CV residual -4685.936 -3494.986 -10.02462 -1913.163 102.7085 -1377.843
## 942 945 950 951 952 955 1078
## Predicted 7771.2571 9209.209 9858.992 8463.556 9147.701 9253.52 11698.92
## cvpred 7881.9729 9231.527 9946.743 8646.480 9019.096 9247.40 11824.66
## price 7057.0000 6500.000 7500.000 12000.000 6900.000 6500.00 5900.00
## CV residual -824.9729 -2731.527 -2446.743 3353.520 -2119.096 -2747.40 -5924.66
## 1080 1089 1090 1099 1104 1108
## Predicted 14634.588 9248.046 10905.195 9050.126 8704.233 8942.286
## cvpred 14505.529 9508.554 10971.615 9215.914 8794.170 9156.169
## price 6500.000 7500.000 14500.000 6622.000 8000.000 6730.000
## CV residual -8005.529 -2008.554 3528.385 -2593.914 -794.170 -2426.169
## 1110 1115 1117 1126 1127 1130
## Predicted 8261.339 9092.844 10580.323 8195.0970 9005.545 9565.404
## cvpred 8348.225 9520.570 11297.953 8189.2758 9504.927 9781.792
## price 12500.000 6000.000 7345.000 7714.0000 6879.000 7113.000
## CV residual 4151.775 -3520.570 -3952.953 -475.2758 -2625.927 -2668.792
## 1132 1138 1144 1145 1152 1159
## Predicted 9650.499 8891.491 8322.573 8575.231 9483.849 8377.2836
## cvpred 9998.843 8993.737 8592.133 9193.302 9489.812 8728.0559
## price 5347.000 23300.000 5342.000 5500.000 15000.000 9000.0000
## CV residual -4651.843 14306.263 -3250.133 -3693.302 5510.188 271.9441
## 1175 1176 1180 1184 1187 1280
## Predicted 9018.41161 8915.621 8898.838 8237.747 8884.771 8412.2794
## cvpred 9281.71254 9066.576 8943.577 8474.776 8903.797 8057.2375
## price 9300.00000 5999.000 7143.000 5500.000 12000.000 7447.0000
## CV residual 18.28746 -3067.576 -1800.577 -2974.776 3096.203 -610.2375
## 1300 1307 1314 1319 1321 1323
## Predicted 12925.867 8805.7905 9668.369 9597.277 11804.747 8352.262
## cvpred 13072.785 8962.7202 9789.983 9683.026 11519.867 8421.485
## price 11750.000 8529.0000 5147.000 6200.000 16347.000 5200.000
## CV residual -1322.785 -433.7202 -4642.983 -3483.026 4827.133 -3221.485
## 1329 1331 1378 1379 1380 1382
## Predicted 10448.03 7915.735 8147.975 11994.375 8261.895 11241.711
## cvpred 10603.25 8031.138 8188.125 12123.313 8463.107 11159.729
## price 28098.00 21433.000 5183.000 6143.000 6000.000 20000.000
## CV residual 17494.75 13401.862 -3005.125 -5980.313 -2463.107 8840.271
## 1384 1387 1402 1407 1409 1418
## Predicted 10224.880 8851.403 13300.55 7518.33460 9631.922 9356.50702
## cvpred 9952.665 9012.477 13322.24 7454.51732 9511.453 9323.16347
## price 5500.000 5504.000 27563.00 7500.00000 20500.000 9300.00000
## CV residual -4452.665 -3508.477 14240.76 45.48268 10988.547 -23.16347
## 1425 1426 1432 1437 1438 1442
## Predicted 10099.933 9565.314 8297.063 10830.116 8858.437 8902.709
## cvpred 10095.544 9748.560 8562.558 11042.721 9032.367 9086.814
## price 6000.000 5200.000 5999.000 7016.000 5055.000 13549.000
## CV residual -4095.544 -4548.560 -2563.558 -4026.721 -3977.367 4462.186
## 1444 1445 1448 1453 1455 1456
## Predicted 8964.633 7816.9204 8319.544 6906.595 9649.719 12597.127
## cvpred 9309.090 7637.5827 8578.965 6775.693 9987.244 12497.887
## price 6929.000 8000.0000 5999.000 5204.000 14571.000 6000.000
## CV residual -2380.090 362.4173 -2579.965 -1571.693 4583.756 -6497.887
## 1457 1458 1460 1465 1758 1761 1823
## Predicted 12737.801 8925.190 10683.79 13982.88 6928.916 6951.486 7062.674
## cvpred 12895.693 9103.221 10581.76 14086.92 6938.379 6988.017 6849.827
## price 18000.000 8000.000 7227.00 22000.00 8000.000 20000.000 10000.000
## CV residual 5104.307 -1103.221 -3354.76 7913.08 1061.621 13011.983 3150.173
## 1837 1897 1905 1910 1914 1917
## Predicted 7468.641 6864.1429 6899.3114 7103.009 6176.4118 7001.912
## cvpred 6992.021 6769.3977 6868.8493 6901.500 5828.5163 6945.771
## price 11644.000 5800.0000 5990.0000 9999.000 5679.0000 5202.000
## CV residual 4651.979 -969.3977 -878.8493 3097.500 -149.5163 -1743.771
## 1922 2149 2181 2208 2216 2217
## Predicted 7332.228 6920.41259 6975.510 9724.0881 6337.345 7148.619
## cvpred 7408.199 6928.52027 6697.671 9616.5612 6155.925 7237.958
## price 5202.000 6900.00000 10000.000 10000.0000 8260.000 10491.000
## CV residual -2206.199 -28.52027 3302.329 383.4388 2104.075 3253.042
## 2297 2304 2306 2308 2310 2313
## Predicted 8230.266 10235.043 10142.225 8209.164 11833.469 10452.917
## cvpred 8288.727 10463.428 10248.118 8229.056 11710.667 10931.008
## price 6500.000 6500.000 5162.000 6000.000 6666.000 6643.000
## CV residual -1788.727 -3963.428 -5086.118 -2229.056 -5044.667 -4288.008
## 2316 2323 2326 2334 2348 2349
## Predicted 9053.580 9614.864 8887.396 8344.274 7964.1597 8907.028
## cvpred 9381.164 9817.742 8953.783 8596.940 7715.4164 9023.486
## price 8000.000 13143.000 5500.000 6105.000 7500.0000 7571.000
## CV residual -1381.164 3325.258 -3453.783 -2491.940 -215.4164 -1452.486
## 2369 2373 2374 2376 2379 2385 2387
## Predicted 8085.216 8227.4162 9670.540 9030.416 8260.515 11348.60 8172.5263
## cvpred 7679.520 8342.0231 9502.746 8857.407 8506.370 11414.82 8139.6369
## price 9998.000 8500.0000 8013.000 5845.000 6500.000 14143.00 8500.0000
## CV residual 2318.480 157.9769 -1489.746 -3012.407 -2006.370 2728.18 360.3631
## 2398 2404 2410 2427 2438 2439
## Predicted 8007.5207 9633.302 9563.469 8265.327 8997.776 10812.778
## cvpred 8152.9895 9468.190 9710.173 8439.455 9303.692 11177.563
## price 7900.0000 8013.000 8118.000 6500.000 5157.000 9840.000
## CV residual -252.9895 -1455.190 -1592.173 -1939.455 -4146.692 -1337.563
## 2444 2447 2461 2463 2465 2466
## Predicted 9557.080 8248.007 9482.379 8193.187 9480.910 9794.846
## cvpred 9838.396 8489.789 9499.844 8107.893 9509.876 9491.245
## price 5500.000 11051.000 12000.000 5747.000 7500.000 25200.000
## CV residual -4338.396 2561.211 2500.156 -2360.893 -2009.876 15708.755
## 2509 2510 2512 2514 2516 2518
## Predicted 8250.766 9753.22690 7548.118 9602.580 9720.056 8945.449
## cvpred 8403.262 10095.51107 7590.510 9697.878 10186.147 9032.823
## price 6000.000 10000.00000 5001.000 11250.000 6283.000 6000.000
## CV residual -2403.262 -95.51107 -2589.510 1552.122 -3903.147 -3032.823
## 2520 2532 2542 2545 2550 2551
## Predicted 8365.376 8379.443 8181.854 8221.762 9341.369 10377.9664
## cvpred 8656.611 8696.392 8364.072 8278.869 9722.549 10862.8022
## price 6743.000 7517.000 6000.000 10000.000 7240.000 9886.0000
## CV residual -1913.611 -1179.392 -2364.072 1721.131 -2482.549 -976.8022
## 2553 2555 2559 2560 2562 2577
## Predicted 10180.2428 10371.62 10160.611 10271.681 10146.5437 8941.320
## cvpred 10294.2735 10821.28 10224.571 10552.848 10184.7899 8913.124
## price 10142.0000 11993.00 18000.000 18000.000 10029.0000 5500.000
## CV residual -152.2735 1171.72 7775.429 7447.152 -155.7899 -3413.124
## 2623 2630 2633 2635 2636 2639
## Predicted 8862.935 8981.774 8926.9733 8907.0279 8233.518 9735.683
## cvpred 9035.503 9192.293 9023.1384 9023.4859 8198.613 10249.127
## price 15357.000 8500.000 8140.0000 8713.0000 6500.000 12286.000
## CV residual 6321.497 -692.293 -883.1384 -310.4859 -1698.613 2036.873
## 2643 2646 2647 2652 2653 2654
## Predicted 8231.735 8460.06663 8876.044 9032.4790 8441.591 8845.194
## cvpred 8278.695 8825.07098 8997.220 9321.4932 8815.386 8834.441
## price 5679.000 8888.00000 5466.000 9429.0000 10000.000 8000.000
## CV residual -2599.695 62.92902 -3531.220 107.5068 1184.614 -834.441
## 2660 2785 2786 2790 2791 2794
## Predicted 8966.550 7373.3150 9586.954 8956.197 8394.930 7957.208
## cvpred 9092.494 7370.2486 9603.040 9143.562 8654.335 7920.983
## price 6600.000 6888.0000 8160.000 6263.000 5162.000 19500.000
## CV residual -2492.494 -482.2486 -1443.040 -2880.562 -3492.335 11579.017
## 2795 2802 2803 2804 2811 2813
## Predicted 7969.267 10288.912 9136.7401 7742.483 8495.495 8973.540
## cvpred 8207.844 10437.424 9494.7399 7895.849 8477.400 9250.272
## price 6143.000 9000.000 8550.0000 8000.000 6650.000 8000.000
## CV residual -2064.844 -1437.424 -944.7399 104.151 -1827.400 -1250.272
## 2816 2824 2825 2827 2828 2830
## Predicted 11117.826 8981.051 8847.5052 7820.218 10863.637 9396.871
## cvpred 11247.801 8882.786 8779.6529 8084.540 11000.387 9735.140
## price 6857.000 6156.000 8200.0000 5171.000 5575.000 8285.000
## CV residual -4390.801 -2726.786 -579.6529 -2913.540 -5425.387 -1450.140
## 2831 2832 2834 2835 2838 2842
## Predicted 8845.643 9822.735 9165.402 9839.3605 9467.203 9072.921
## cvpred 8968.741 10179.151 9548.434 9877.0401 9402.790 9162.271
## price 11000.000 11026.000 18000.000 10450.0000 11000.000 5990.000
## CV residual 2031.259 846.849 8451.566 572.9599 1597.210 -3172.271
## 2879 2881 2882 2886 2887 2891
## Predicted 10386.47 9019.1416 8487.8760 8855.2257 10571.399 9952.094
## cvpred 10551.10 9189.6489 8536.2918 8708.8508 10337.738 9818.477
## price 7134.00 8950.0000 7938.0000 9000.0000 7282.000 5700.000
## CV residual -3417.10 -239.6489 -598.2918 291.1492 -3055.738 -4118.477
## 2892 2893 2913 2928 2961 2964
## Predicted 9848.6721 10604.704 9530.5238 10979.161 8866.295 9067.6476
## cvpred 10007.3895 10581.199 9612.1726 11128.822 8894.112 9420.9448
## price 10600.0000 7000.000 10000.0000 17500.000 5137.000 9143.0000
## CV residual 592.6105 -3581.199 387.8274 6371.178 -3757.112 -277.9448
## 2966 2969 2973 2980 2994 2995
## Predicted 8380.913 9022.506 9588.7542 10227.409 8871.304 10149.438
## cvpred 8686.360 9321.667 9843.0302 10498.401 9181.875 10334.471
## price 6459.000 5335.000 9000.0000 7523.000 5950.000 7890.000
## CV residual -2227.360 -3986.667 -843.0302 -2975.401 -3231.875 -2444.471
## 2996 2999 3003 3006 3011 3201
## Predicted 9609.300 7723.782 8319.392 8331.766 10235.357 10344.3523
## cvpred 9787.819 8021.304 8427.265 8580.359 10393.378 10513.1414
## price 12500.000 10357.000 15000.000 5500.000 7000.000 11102.0000
## CV residual 2712.181 2335.696 6572.735 -3080.359 -3393.378 588.8586
## 3202 3203 3204 3208 3212 3217
## Predicted 10063.272 11469.993 10196.779 9778.284 9077.481 8753.4978
## cvpred 10189.946 11602.441 10331.653 10058.241 9283.876 8669.5712
## price 12066.000 15923.000 13950.000 5767.000 7950.000 8888.0000
## CV residual 1876.054 4320.559 3618.347 -4291.241 -1333.876 218.4288
## 3220 3237 3291 3295 3305 3306
## Predicted 8395.598 8380.841 13365.417 10131.007 8342.805 8284.000
## cvpred 8411.944 8393.796 13524.451 10155.041 8606.972 8431.095
## price 6500.000 6500.000 8600.000 8648.000 5571.000 6174.000
## CV residual -1911.944 -1893.796 -4924.451 -1507.041 -3035.972 -2257.095
## 3309 3310 3313 3315 3325 3327
## Predicted 8313.980 8277.432 8210.634 8242.8635 7231.690 7568.350
## cvpred 8549.043 8492.855 8219.024 8338.5401 7101.399 7605.349
## price 5343.000 6428.000 9520.000 7647.0000 5850.000 6286.000
## CV residual -3206.043 -2064.855 1300.976 -691.5401 -1251.399 -1319.349
## 3328 3339 3340 3341 3342 3343 3364
## Predicted 7794.899 6957.051 7126.174 6958.520 7541.729 7312.645 7010.4710
## cvpred 8231.807 7017.940 7425.257 7007.908 7718.733 7448.242 7230.3581
## price 11559.000 5071.000 9272.000 5403.000 6900.000 10000.000 7406.0000
## CV residual 3327.193 -1946.940 1846.743 -1604.908 -818.733 2551.758 175.6419
## 3373 3374 3376 3378 3442 3443
## Predicted 10778.165 8918.1564 8237.299 8868.920 8356.272 10276.555
## cvpred 10820.271 9083.3306 8308.618 8944.098 8701.617 10604.402
## price 13502.000 8759.0000 5478.000 13489.000 6625.000 6132.000
## CV residual 2681.729 -324.3306 -2830.618 4544.902 -2076.617 -4472.402
## 3447 3454
## Predicted 8328.738 7641.49176
## cvpred 8567.192 7920.70217
## price 6107.000 8000.00000
## CV residual -2460.192 79.29783
##
## Sum of squares = 5332029630 Mean square = 14977611 n = 356
##
## fold 2
## Observations in test set: 356
## 110 111 113 127 144 160
## Predicted 9380.4575 8156.568 7503.577 9201.079 8490.876 9733.5411
## cvpred 9463.0658 8373.965 7754.378 9507.559 8684.496 9811.6189
## price 9000.0000 5071.000 5500.000 6500.000 5162.000 10014.0000
## CV residual -463.0658 -3302.965 -2254.378 -3007.559 -3522.496 202.3811
## 165 166 167 170 180 181
## Predicted 9074.011 9618.111 7868.058 9621.18 8424.7589 9729.912
## cvpred 9282.574 10068.803 8149.410 9628.00 8565.8169 9805.434
## price 10800.000 6500.000 5162.000 13591.00 7847.0000 12424.000
## CV residual 1517.426 -3568.803 -2987.410 3963.00 -718.8169 2618.566
## 190 195 204 208 219 294
## Predicted 8760.004 8129.3591 8127.604 7770.315 7689.253 8592.759
## cvpred 8957.506 8534.7176 8388.233 7990.097 7880.891 8881.008
## price 7227.000 8959.0000 11357.000 6895.000 5500.000 11599.000
## CV residual -1730.506 424.2824 2968.767 -1095.097 -2380.891 2717.992
## 296 305 307 311 322 323 324
## Predicted 8817.704 8622.875 8846.439 8885.927 8150.9144 8165.493 8084.762
## cvpred 8934.733 8861.328 8926.227 8944.501 8359.2555 8326.619 8332.726
## price 5885.000 9280.000 6000.000 16286.000 9301.0000 6429.000 9800.000
## CV residual -3049.733 418.672 -2926.227 7341.499 941.7445 -1897.619 1467.274
## 327 330 332 336 358 359
## Predicted 8307.636 8338.710 8314.6701 7615.651 7529.777 7646.411
## cvpred 8403.211 8429.125 8406.7466 7825.365 7777.055 7825.383
## price 5457.000 5776.000 8000.0000 8000.000 6300.000 5162.000
## CV residual -2946.211 -2653.125 -406.7466 174.635 -1477.055 -2663.383
## 365 372 381 390 391 392
## Predicted 7562.0064 6947.257 6956.450 7682.359 8181.0295 8171.236
## cvpred 7782.9595 7252.227 7256.062 7854.533 8339.5753 8358.388
## price 6880.0000 10220.000 10219.000 5162.000 8000.0000 5851.000
## CV residual -902.9595 2967.773 2962.938 -2692.533 -339.5753 -2507.388
## 393 397 401 404 408 409
## Predicted 8209.1644 9452.775 9568.253 11469.640 8140.252 8727.000
## cvpred 8353.7166 9464.598 9532.935 11249.579 8349.535 8925.532
## price 8000.0000 6768.000 7562.000 6657.000 7000.000 7373.000
## CV residual -353.7166 -2696.598 -1970.935 -4592.579 -1349.535 -1552.532
## 414 415 420 441 448 449
## Predicted 8609.631 8661.627 8251.501 8338.710 11410.100 7920.610
## cvpred 8908.076 8910.475 8434.335 8429.125 11215.721 8184.545
## price 7260.000 7728.000 6808.000 5938.000 10000.000 6000.000
## CV residual -1648.076 -1182.475 -1626.335 -2491.125 -1215.721 -2184.545
## 454 502 519 527 528 530
## Predicted 8378.287 8240.417 9121.292 8310.575 9517.862 8255.775
## cvpred 8464.460 8481.382 9098.834 8414.983 9528.197 8392.587
## price 7213.000 5988.000 6999.000 5514.000 5030.000 6000.000
## CV residual -1251.460 -2493.382 -2099.834 -2900.983 -4498.197 -2392.587
## 531 534 537 554 560 564
## Predicted 9049.799 8913.3717 9581.165 7801.9323 8365.376 8836.6908
## cvpred 9083.490 8964.2296 9560.015 7914.6338 8437.380 8919.7539
## price 8000.000 9214.0000 6000.000 7071.0000 6833.000 8500.0000
## CV residual -1083.490 249.7704 -3560.015 -843.6338 -1604.380 -419.7539
## 577 578 579 585 587 589
## Predicted 8919.9396 8226.171 10153.58 9900.632 9461.5469 10120.6130
## cvpred 8982.1868 8372.559 10077.76 10069.301 9525.1997 10102.7285
## price 8000.0000 9544.000 17671.00 5981.000 10070.0000 9195.0000
## CV residual -982.1868 1171.441 7593.24 -4088.301 544.8003 -907.7285
## 594 595 596 601 614 629
## Predicted 10208.467 10299.816 9643.554 9772.231 12041.67583 10201.3440
## cvpred 10117.217 10146.116 9588.585 9635.460 11798.80689 10096.6215
## price 7392.000 11172.000 11471.000 18857.000 11733.00000 11060.0000
## CV residual -2725.217 1025.884 1882.415 9221.540 -65.80689 963.3785
## 636 689 690 691 697 700
## Predicted 9591.165 8934.600 8186.863 9873.259 9434.147 8299.312
## cvpred 9649.760 9472.389 8388.405 10166.356 9547.817 8427.910
## price 8280.000 8000.000 6367.000 8060.000 8389.000 5814.000
## CV residual -1369.760 -1472.389 -2021.405 -2106.356 -1158.817 -2613.910
## 701 702 705 709 711 715
## Predicted 8935.342 8213.573 9672.603 8307.9501 8216.288 8831.127
## cvpred 9003.369 8371.375 9605.974 8429.1063 8374.312 8922.105
## price 9984.000 5157.000 11232.000 8736.0000 6250.000 7177.000
## CV residual 980.631 -3214.375 1626.026 306.8937 -2124.312 -1745.105
## 716 720 721 724 727 734
## Predicted 9623.367 9060.614 9424.819 8959.203 9582.410 8338.710
## cvpred 9581.227 9052.893 9484.577 8991.627 9557.066 8429.125
## price 12400.000 29500.000 8071.000 11000.000 5995.000 7000.000
## CV residual 2818.773 20447.107 -1413.577 2008.373 -3562.066 -1429.125
## 736 738 740 744 749 769
## Predicted 8976.209 8987.651 9524.895 8870.704 8482.558 8149.445
## cvpred 9010.469 9031.663 9531.732 8957.440 8926.494 8353.369
## price 18720.000 15000.000 24800.000 12480.000 5872.000 5500.000
## CV residual 9709.531 5968.337 15268.268 3522.560 -3054.494 -2853.369
## 772 774 787 788 792 802
## Predicted 8160.063 11457.6421 8228.8857 8172.616 7639.377 10762.628
## cvpred 8388.375 11225.7472 8375.4968 8347.214 7821.848 10619.069
## price 6843.000 10857.0000 9000.0000 7186.000 6583.000 7908.000
## CV residual -1545.375 -368.7472 624.5032 -1161.214 -1238.848 -2711.069
## 803 805 808 817 818 891
## Predicted 8722.8161 8136.202 9990.3772 7538.145 7590.141 6742.485
## cvpred 8916.7087 8400.118 10032.9075 7794.702 7797.101 8117.018
## price 8050.0000 5500.000 10500.0000 10000.000 9999.000 5265.000
## CV residual -866.7087 -2900.118 467.0925 2205.298 2201.899 -2852.018
## 892 893 896 898 905 906
## Predicted 8749.324 8050.619 8948.204 8955.9278 11314.143 9891.156
## cvpred 9087.174 8410.178 9109.996 9107.9442 11323.089 10046.869
## price 6500.000 6300.000 13500.000 9507.0000 8359.000 18000.000
## CV residual -2587.174 -2110.178 4390.004 399.0558 -2964.089 7953.131
## 908 909 911 914 918 923
## Predicted 9500.233 9625.950 9084.038 9016.752 10273.992 8406.182
## cvpred 9564.477 9841.503 9087.831 9088.500 10213.466 8581.188
## price 8099.000 11500.000 6000.000 17493.000 5325.000 5835.000
## CV residual -1465.477 1658.497 -3087.831 8404.500 -4888.466 -2746.188
## 927 928 932 937 939 943
## Predicted 7577.215 9482.581 9280.411 9200.3920 10097.646 7864.805
## cvpred 8681.178 9525.935 9456.778 9303.3131 10116.624 8072.078
## price 5203.000 15704.000 14000.000 9000.0000 10682.000 6000.000
## CV residual -3478.178 6178.065 4543.222 -303.3131 565.376 -2072.078
## 944 946 947 953 956 1073
## Predicted 8177.273 10320.502 9091.835 8793.267 8430.6094 9076.1909
## cvpred 8414.958 10170.927 9214.261 9064.251 8636.4333 9162.4577
## price 10999.000 6759.000 5900.000 5300.000 8000.0000 9500.0000
## CV residual 2584.042 -3411.927 -3314.261 -3764.251 -636.4333 337.5423
## 1079 1087 1098 1100 1105 1109
## Predicted 10308.762 9485.970 8556.340 9890.885 8945.135 8362.750
## cvpred 10632.538 9589.439 9177.315 9868.084 8984.556 8451.503
## price 7500.000 7229.000 5786.000 5857.000 5500.000 5081.000
## CV residual -3132.538 -2360.439 -3391.315 -4011.084 -3484.556 -3370.503
## 1119 1124 1128 1131 1133 1135
## Predicted 8855.032 7633.8130 8883.0774 8306.257 9630.087 9531.615
## cvpred 8952.708 7824.1989 8949.7893 8414.385 9558.867 9509.372
## price 11618.000 8500.0000 8614.0000 6139.000 13586.000 5786.000
## CV residual 2665.292 675.8011 -335.7893 -2275.385 4027.133 -3723.372
## 1136 1143 1147 1148 1150 1154
## Predicted 9621.674 7666.8222 8956.264 10180.422 9600.573 9033.635
## cvpred 9566.506 7841.5765 8979.854 10120.136 9555.900 9018.743
## price 5266.000 6888.0000 5500.000 5435.000 7950.000 6542.000
## CV residual -4300.506 -953.5765 -3479.854 -4685.136 -1605.900 -2476.743
## 1164 1166 1168 1172 1174 1181
## Predicted 9019.657 10294.252 8957.733 8889.179 9011.378 8982.239
## cvpred 9028.733 10148.467 8985.741 8982.168 9028.146 8993.697
## price 7500.000 8816.000 12675.000 6000.000 7014.000 7000.000
## CV residual -1528.733 -1332.467 3689.259 -2982.168 -2014.146 -1993.697
## 1182 1198 1285 1290 1291 1292
## Predicted 8992.992 8712.288 11513.578 15766.79 10423.956 10252.652
## cvpred 9020.477 8898.763 11509.724 15262.44 10162.876 11442.619
## price 10500.000 6500.000 6400.000 25454.00 6280.000 6660.000
## CV residual 1479.523 -2398.763 -5109.724 10191.56 -3882.876 -4782.619
## 1293 1295 1299 1302 1306 1308
## Predicted 7624.236 12367.717 8604.0169 8852.3739 9572.824 8141.139
## cvpred 8760.105 12283.487 8950.3095 9096.1314 9840.401 8628.661
## price 7504.000 12390.000 9500.0000 9575.0000 7200.000 10000.000
## CV residual -1256.105 106.513 549.6905 478.8686 -2640.401 1371.339
## 1310 1312 1318 1320 1322 1324
## Predicted 10387.688 10561.047 9563.4883 10464.935 9127.772 10920.48
## cvpred 10411.197 10202.112 9763.7814 10522.661 9179.061 10797.00
## price 8454.000 12169.000 10000.0000 8000.000 7857.000 7579.00
## CV residual -1957.197 1966.888 236.2186 -2522.661 -1322.061 -3218.00
## 1326 1327 1330 1336 1342 1383
## Predicted 13153.129 8958.598 8251.479 8164.552 7016.117 9324.232
## cvpred 12457.205 9068.536 8399.179 8325.174 8150.288 9477.129
## price 20343.000 5100.000 6429.000 5877.000 5113.000 7500.000
## CV residual 7885.795 -3968.536 -1970.179 -2448.174 -3037.288 -1977.129
## 1385 1386 1389 1391 1394 1395
## Predicted 9447.255390 9871.539 8796.513 8158.4589 8326.5782 7559.847
## cvpred 9509.294221 10009.566 8907.066 8323.0833 8413.5182 7782.660
## price 9500.000000 5486.000 11500.000 8657.0000 9082.0000 5500.000
## CV residual -9.294221 -4523.566 2592.934 333.9167 668.4818 -2282.660
## 1396 1397 1398 1400 1401 1403
## Predicted 11271.226 8471.802 10219.506 8207.6949 10845.563 10128.068
## cvpred 11144.704 8190.292 10095.455 8347.8306 10655.606 10049.496
## price 8800.000 18018.000 5365.000 8080.0000 16429.000 16000.000
## CV residual -2344.704 9827.708 -4730.455 -267.8306 5773.394 5950.504
## 1404 1408 1411 1414 1420 1422
## Predicted 8221.762 8013.204 10668.474 8981.460 8190.178 11333.150
## cvpred 8354.901 9146.911 10637.804 8982.224 8368.695 11187.696
## price 7143.000 5500.000 9000.000 7117.000 7314.000 14200.000
## CV residual -1211.901 -3646.911 -1637.804 -1865.224 -1054.695 3012.304
## 1423 1429 1430 1436 1439 1440
## Predicted 10662.821 8786.720 8319.544 8872.504 9549.177 8297.063
## cvpred 10623.094 8925.879 8409.983 8957.129 9530.853 8410.551
## price 15000.000 5221.000 5999.000 5356.000 5352.000 6000.000
## CV residual 4376.906 -3704.879 -2410.983 -3601.129 -4178.853 -2410.551
## 1441 1452 1454 1459 1461 1466
## Predicted 7968.932 8816.925 9698.265 8341.336 12751.87 8964.2183
## cvpred 9142.460 8923.259 9593.921 8415.002 12350.20 8683.2715
## price 5500.000 6400.000 8466.000 5999.000 28514.00 8500.0000
## CV residual -3642.460 -2523.259 -1127.921 -2416.002 16163.80 -183.2715
## 1467 1821 1822 1824 1913 1921
## Predicted 10261.708 6217.515 6990.267 10772.159 6910.6191 6927.446
## cvpred 10116.667 7621.432 8420.281 11288.952 7228.6646 7213.387
## price 7227.000 6200.000 10000.000 5110.000 7360.0000 6187.000
## CV residual -2889.667 -1421.432 1579.719 -6178.952 131.3354 -1026.387
## 1923 2206 2207 2218 2265 2269
## Predicted 6338.1873 8715.799 9613.753 5984.612 6987.71 7059.18041
## cvpred 7515.1628 8805.151 9986.080 6954.802 7012.73 7182.05871
## price 6615.0000 5120.000 15000.000 10000.000 5455.00 7201.00000
## CV residual -900.1628 -3685.151 5013.920 3045.198 -1557.73 18.94129
## 2270 2305 2307 2318 2320 2325
## Predicted 7217.9276 12672.517 9623.054 10271.08079 8598.127 9618.269
## cvpred 7166.9073 12345.134 9555.332 10154.62209 8944.260 9569.155
## price 7413.0000 10000.000 6500.000 10238.00000 12500.000 5474.000
## CV residual 246.0927 -2345.134 -3055.332 83.37791 3555.740 -4095.155
## 2327 2329 2332 2335 2336 2337 2339
## Predicted 8338.710 9025.445 9652.658 8238.769 8985.868 8270.156 8773.199
## cvpred 8429.125 9035.217 9575.359 8373.744 8999.882 8425.553 9127.699
## price 9500.000 6857.000 7500.000 7856.000 10000.000 5059.000 7314.000
## CV residual 1070.875 -2178.217 -2075.359 -517.744 1000.118 -3366.553 -1813.699
## 2345 2346 2350 2353 2354 2358
## Predicted 8987.338 9545.996 8311.203 9869.872 9216.836 8359.8114
## cvpred 9005.768 9542.338 8466.774 10069.282 9474.981 8439.7307
## price 8000.000 10304.000 5888.000 13000.000 14545.000 8571.0000
## CV residual -1005.768 761.662 -2578.774 2930.718 5070.019 131.2693
## 2377 2378 2380 2381 2382 2384
## Predicted 10444.910 8192.248 9051.797 8777.616 8099.2836 10936.3114
## cvpred 10572.905 8351.934 9017.577 8939.105 8121.7315 10707.1528
## price 8643.000 6500.000 23334.000 5083.000 8569.0000 11564.0000
## CV residual -1929.905 -1851.934 14316.423 -3856.105 447.2685 856.8472
## 2393 2395 2396 2397 2402 2409
## Predicted 7687.554 7441.232 7694.178 7785.616 10165.441 9448.169
## cvpred 7580.299 7835.785 7844.245 7890.204 10109.818 9512.542
## price 5198.000 5800.000 5800.000 8990.000 19988.000 5500.000
## CV residual -2382.299 -2035.785 -2044.245 1099.796 9878.182 -4012.542
## 2412 2440 2441 2442 2458 2472
## Predicted 9611.925 8885.927 8890.89066 9096.938 9500.031 9369.195
## cvpred 9560.033 8944.501 8964.79776 9050.561 9523.166 9475.993
## price 13714.000 5500.000 9000.00000 7082.000 14455.000 7293.000
## CV residual 4153.967 -3444.501 35.20224 -1968.561 4931.834 -2182.993
## 2507 2508 2513 2522 2524 2525
## Predicted 8930.244 11557.808 8865.1840 8909.967 8884.771 8302.072
## cvpred 8991.298 11289.963 9002.1362 8966.879 8964.510 8405.562
## price 5829.000 8000.000 8214.0000 7600.000 6500.000 8824.000
## CV residual -3162.298 -3289.963 -788.1362 -1366.879 -2464.510 418.438
## 2526 2528 2546 2552 2556 2561
## Predicted 8948.074 9467.712 9674.539 8413.456 9092.844 9068.803
## cvpred 8996.328 9500.202 9597.438 8482.136 9058.798 9036.420
## price 6000.000 7571.000 10993.000 10000.000 25857.000 10143.000
## CV residual -2996.328 -1929.202 1395.562 1517.864 16798.202 1106.580
## 2570 2573 2575 2578 2622 2631
## Predicted 8203.600 10795.126 8938.657 8330.800 8911.302 9088.749
## cvpred 8356.067 10676.154 8983.659 8902.835 8980.991 9067.035
## price 6000.000 12679.000 8000.000 5870.000 6786.000 7500.000
## CV residual -2356.067 2002.846 -983.659 -3032.835 -2194.991 -1567.035
## 2634 2637 2641 2645 2648 2649
## Predicted 9767.2943 8861.197 9079.08979 9705.1467 10275.776 8882.477
## cvpred 9689.6004 8927.710 9077.62228 9637.7734 10123.738 8972.437
## price 10030.0000 11419.000 9000.00000 9003.0000 11660.000 6000.000
## CV residual 340.3996 2491.290 -77.62228 -634.7734 1536.262 -2972.437
## 2655 2656 2661 2663 2664 2666
## Predicted 9595.232 8949.230 8330.207 9200.760 8843.724 8878.893
## cvpred 9567.085 8976.319 8419.703 9195.399 8923.289 8940.966
## price 7871.000 6500.000 6500.000 7500.000 15000.000 6000.000
## CV residual -1696.085 -2476.319 -1919.703 -1695.399 6076.711 -2940.966
## 2667 2787 2792 2793 2796 2797
## Predicted 8449.7802 9552.520 10253.3957 8367.058 10324.081 8834.2914
## cvpred 8479.8039 9625.246 10205.7159 8548.926 10216.994 8976.7732
## price 8357.0000 7320.000 11000.0000 5329.000 15400.000 9429.0000
## CV residual -122.8039 -2305.246 794.2841 -3219.926 5183.006 452.2268
## 2799 2800 2805 2806 2807 2808
## Predicted 9015.697 9627.687 9613.619 9921.993 9024.2007 8533.602
## cvpred 9068.409 9621.490 9614.419 9965.536 9077.8305 8727.836
## price 7900.000 12240.000 9500.000 14900.000 10000.0000 5200.000
## CV residual -1168.409 2618.510 -114.419 4934.464 922.1695 -3527.836
## 2814 2822 2823 2826 2829 2836
## Predicted 8801.079 10321.949 14315.307 11161.498 8892.248 10122.752
## cvpred 8991.320 10330.798 13937.408 11020.304 9107.608 10223.442
## price 6299.000 6000.000 8800.000 17530.000 6500.000 14750.000
## CV residual -2692.320 -4330.798 -5137.408 6509.696 -2607.608 4526.558
## 2839 2840 2841 2843 2883 2890
## Predicted 10014.231 10269.920 9650.470 9607.225 9033.899 9549.216
## cvpred 10239.991 10228.893 9780.217 9645.179 9124.186 9587.796
## price 11429.000 13337.000 14950.000 8000.000 6973.000 15500.000
## CV residual 1189.009 3108.107 5169.783 -1645.179 -2151.186 5912.204
## 2933 2935 2958 2965 2967 2968
## Predicted 9569.361 10870.0457 8196.5665 9781.514 8905.558 8224.011
## cvpred 10098.360 11154.9849 8352.5321 9656.354 8949.221 8372.260
## price 8000.000 10800.0000 7800.0000 7429.000 6000.000 6423.000
## CV residual -2098.360 -354.9849 -552.5321 -2227.354 -2949.221 -1949.260
## 2970 2976 2978 2991 2992 2993
## Predicted 10257.614 10174.72326 8874.798 8262.674 8616.199 8896.634
## cvpred 10124.904 10130.71215 8949.203 8404.348 8926.034 8996.567
## price 8000.000 10201.00000 17500.000 6500.000 5022.000 7714.000
## CV residual -2124.904 70.28785 8550.797 -1904.348 -3904.034 -1282.567
## 2998 3000 3002 3004 3005 3007
## Predicted 8173.4851 8783.826 10323.166 9082.871 8959.758 8188.063
## cvpred 8375.7476 8888.822 10174.081 9043.490 8994.265 8343.111
## price 8000.0000 5750.000 15000.000 5457.000 5162.000 9381.000
## CV residual -375.7476 -3138.822 4825.919 -3586.490 -3832.265 1037.889
## 3012 3206 3211 3219 3243 3245
## Predicted 9043.608 9757.570 9116.918 8434.216 8180.100 7137.5662
## cvpred 9034.051 9769.647 9192.380 8675.094 8402.481 7444.8982
## price 7500.000 6000.000 5300.000 6500.000 6250.000 7628.0000
## CV residual -1534.051 -3769.647 -3892.380 -2175.094 -2152.481 183.1018
## 3289 3292 3311 3319 3326 3331
## Predicted 8851.493 11487.157 8171.926 7611.332 7640.8467 7663.417
## cvpred 8963.583 11228.714 8352.801 7824.767 7827.7342 7844.226
## price 5162.000 9900.000 13600.000 7214.000 8625.0000 5300.000
## CV residual -3801.583 -1328.714 5247.199 -610.767 797.2658 -2544.226
## 3333 3334 3345 3363 3366 3367
## Predicted 7695.6470 7683.049 8214.307 6895.951 6810.167 9002.561
## cvpred 7850.1307 7848.946 8408.134 7244.241 7212.991 8992.830
## price 7636.0000 6564.000 5899.000 10000.000 10000.000 5782.000
## CV residual -214.1307 -1284.946 -2509.134 2755.759 2787.009 -3210.830
## 3368 3369 3370 3372 3440 3448
## Predicted 10804.830 8903.578 9122.742 6973.9225 8249.297 7670.451
## cvpred 10640.281 8983.043 9501.479 7260.4825 8391.690 7847.762
## price 13065.000 7738.000 5750.000 7406.0000 5229.000 26526.000
## CV residual 2424.719 -1245.043 -3751.479 145.5175 -3162.690 18678.238
##
## Sum of squares = 4995454715 Mean square = 14032176 n = 356
##
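The per-fold summary lines can be checked by hand. The sketch below is not part of the original analysis; it copies the two fold summaries and one observation from the output above and assumes that "Mean square" is the sum of squared CV residuals divided by the number of test observations, and that each CV residual is the observed price minus `cvpred`. The variable names (`fold_ss`, `fold_n`, `fold_ms`, `fold_rmse`, `cv_residual`) are illustrative only.

```r
# Fold summaries copied from the cross-validation output above.
fold_ss <- c(fold1 = 5332029630, fold2 = 4995454715)  # sum of squared CV residuals
fold_n  <- c(fold1 = 356, fold2 = 356)                # observations in each test set

# Assumed definition: mean square = sum of squares / number of test observations.
fold_ms <- fold_ss / fold_n

# The square root puts the error back on the same scale as price.
fold_rmse <- sqrt(fold_ms)

# Assumed definition: CV residual = observed price - cross-validated prediction.
# Values are those printed for observation 2181 in fold 1.
cv_residual <- 10000.000 - 6697.671

print(round(fold_ms))
print(round(fold_rmse))
print(cv_residual)
```

Under these assumptions the computed mean squares round to the printed values (14977611 and 14032176), and the root mean squared errors come out near 3,870 and 3,746 in price units. Those are large relative to predicted prices that mostly fall between about 7,000 and 11,000, which is consistent with the many high-priced listings the model under-predicts by 10,000 or more.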
## fold 3
## Observations in test set: 356
## 112 115 123 125 129 141
## Predicted 8200.061 7582.4176 8925.970 8870.390 8247.9117 8478.039
## cvpred 7969.928 7312.4682 8713.436 8779.034 8132.6773 8096.572
## price 10000.000 6500.0000 9800.000 5372.000 9000.0000 13520.000
## CV residual 2030.072 -812.4682 1086.564 -3407.034 867.3227 5423.428
## 145 146 169 173 174 182
## Predicted 7893.253 9943.198 8339.3893 10296.473 8730.233 10661.24
## cvpred 7473.435 9076.523 8050.2305 10272.051 8263.317 10605.27
## price 6000.000 13285.000 8182.0000 9064.000 5915.000 8999.00
## CV residual -1473.435 4208.477 131.7695 -1208.051 -2348.317 -1606.27
## 189 196 197 214 220 234
## Predicted 8892.131 8306.20630 8161.095 7557.614 9034.077 8145.843
## cvpred 8835.411 7981.28176 8185.087 7194.223 8422.830 7522.462
## price 10325.000 8053.00000 12390.000 6131.000 5940.000 8803.000
## CV residual 1489.589 71.71824 4204.913 -1063.223 -2482.830 1280.538
## 286 290 292 306 308 309
## Predicted 9526.0511 10153.577 9544.806 8821.1538 8810.581 8844.459
## cvpred 9494.3384 10224.651 9092.706 8805.2969 8696.862 8541.215
## price 8500.0000 9419.000 6599.000 9280.0000 6000.000 12000.000
## CV residual -994.3384 -805.651 -2493.706 474.7031 -2696.862 3458.785
## 310 313 325 326 328 329
## Predicted 8858.437 8372.0423 8084.072 8161.3529 8268.239 8201.530
## cvpred 8599.434 7680.5391 7852.048 7876.5001 7891.385 7975.369
## price 5640.000 6899.0000 5800.000 7434.0000 6800.000 6109.000
## CV residual -2959.434 -781.5391 -2052.048 -442.5001 -1091.385 -1866.369
## 357 363 368 371 382 388 389
## Predicted 7429.880 7465.138 6590.096 6728.567 6978.241 7571.934 8458.623
## cvpred 7142.185 7057.702 6093.694 6271.481 6546.200 7138.310 7491.083
## price 9280.000 8900.000 9999.000 22556.000 10219.000 5500.000 23904.000
## CV residual 2137.815 1842.298 3905.306 16284.519 3672.800 -1638.310 16412.917
## 395 398 400 406 407 410
## Predicted 9566.094 9572.348 8196.566 8169.0766 8174.0405 7975.116
## cvpred 9430.804 9457.193 8085.867 7908.3302 7797.8323 7730.417
## price 6204.000 13098.000 6000.000 7000.0000 7000.0000 5162.000
## CV residual -3226.804 3640.807 -2085.867 -908.3302 -797.8323 -2568.417
## 412 413 416 432 436 437
## Predicted 9452.775 7920.2710 9493.687 11945.19 8905.738 8934.562
## cvpred 9520.975 7537.7476 9361.577 9146.90 8628.828 8649.403
## price 5828.000 7000.0000 6486.000 7899.00 6000.000 5286.000
## CV residual -3692.975 -537.7476 -2875.577 -1247.90 -2628.828 -3363.403
## 440 442 445 450 451 453
## Predicted 9077.307 8783.6737 8641.0240 10980.987 9704.8330 9029.540
## cvpred 8681.110 8875.0895 9279.0199 10860.792 9411.4227 8712.815
## price 5500.000 9500.0000 10000.0000 7400.000 8892.0000 27438.000
## CV residual -3181.110 624.9105 720.9801 -3460.792 -519.4227 18725.185
## 493 494 496 497 498 501 512
## Predicted 9525.361 7739.3188 9631.557 8314.76 8842.345 8250.766 8855.722
## cvpred 9458.756 7270.9491 9438.060 7950.92 8728.318 7949.105 8685.233
## price 15116.000 6435.0000 14036.000 6525.00 7500.000 9781.000 13000.000
## CV residual 5657.244 -835.9491 4597.940 -1425.92 -1228.318 1831.895 4314.767
## 513 522 523 529 532 533
## Predicted 8897.234 11584.339 8983.243 9528.990 8392.3547 9536.024
## cvpred 8627.139 11429.075 8749.960 9505.221 8006.3294 9501.469
## price 5500.000 8000.000 6000.000 7000.000 8500.0000 12186.000
## CV residual -3127.139 -3429.075 -2749.960 -2505.221 493.6706 2684.531
## 535 547 550 552 556 557
## Predicted 8451.877 10230.95 9082.871 9106.911 8990.142 8220.562
## cvpred 8018.154 10183.38 8671.917 8671.544 8583.805 7857.367
## price 25000.000 7848.00 26000.000 5343.000 6000.000 5929.000
## CV residual 16981.846 -2335.38 17328.083 -3328.544 -2583.805 -1928.367
## 558 571 574 580 582 586
## Predicted 8309.1059 9567.563 9050.641 9592.9385 8001.916 8854.4318
## cvpred 8025.8359 9436.245 8701.559 9278.9093 7872.372 8548.3457
## price 8014.0000 6000.000 6500.000 9077.0000 5123.000 9318.0000
## CV residual -11.8359 -3436.245 -2201.559 -201.9093 -2749.372 769.6543
## 588 591 598 602 615 619
## Predicted 8943.666 9035.732 11457.73 9513.453 9590.824 9032.2549
## cvpred 8752.397 8765.537 11496.61 9507.283 9466.012 8627.0165
## price 6468.000 15000.000 25000.00 10734.000 7911.000 9555.0000
## CV residual -2284.397 6234.463 13503.39 1226.717 -1555.012 927.9835
## 628 632 634 635 703 704
## Predicted 9637.211 11947.343 8421.959083 10027.105 9560.198 9029.540
## cvpred 9363.144 12444.288 7996.762815 9932.614 9141.524 8712.815
## price 12600.000 10850.000 8000.000000 11211.000 12400.000 6909.000
## CV residual 3236.856 -1594.288 3.237185 1278.386 3258.476 -1803.815
## 707 710 712 714 718 723
## Predicted 9078.7761 8703.184 9567.098 9001.4052 9036.574 8786.765
## cvpred 8686.5513 8514.949 9497.343 8727.8224 8709.063 8793.916
## price 8900.0000 7000.000 5400.000 8000.0000 6229.000 5250.000
## CV residual 213.4487 -1514.949 -4097.343 -727.8224 -2480.063 -3543.916
## 725 728 729 732 733 741
## Predicted 8266.9036 8983.243 9631.557 10232.418 8319.0786 8140.297
## cvpred 8048.3474 8749.960 9438.060 10188.821 8032.9663 8115.882
## price 7286.0000 5300.000 7000.000 13653.000 9000.0000 8000.000
## CV residual -762.3474 -3449.960 -2438.060 3464.179 967.0337 -115.882
## 742 743 748 767 768 770
## Predicted 9018.412 8955.108 9531.615 7914.017 8889.421 7744.883
## cvpred 8731.201 8764.968 9485.145 7511.359 8661.032 7261.756
## price 7532.000 12400.000 15000.000 11000.000 5300.000 24175.000
## CV residual -1199.201 3635.032 5514.855 3488.641 -3361.032 16913.244
## 773 775 777 781 783 784
## Predicted 11255.18 8286.5353 8504.580 7618.2761 8377.284 8298.53288
## cvpred 11382.73 8031.6505 7915.341 7329.2906 7947.294 7917.40072
## price 19800.00 7500.0000 7714.000 6600.0000 6000.000 8000.00000
## CV residual 8417.27 -531.6505 -201.341 -729.2906 -1947.294 82.59928
## 791 800 801 811 889 895
## Predicted 9531.615 8221.852 8170.546 9626.638 9236.682 8532.7125
## cvpred 9485.145 7994.254 7913.771 9254.709 7891.905 7774.7504
## price 25000.000 5358.000 13857.000 8056.000 18518.000 8500.0000
## CV residual 15514.855 -2636.254 5943.229 -1198.709 10626.095 725.2496
## 897 899 901 904 907 910
## Predicted 9028.4514 9658.244 9551.539 14093.783 10523.03 10738.274
## cvpred 8667.3182 8909.510 9386.367 14803.474 11121.96 10611.131
## price 9286.0000 5800.000 12342.000 9566.000 7000.00 5500.000
## CV residual 618.6818 -3109.510 2955.633 -5237.474 -4121.96 -5111.131
## 912 917 920 922 925 926
## Predicted 7920.421 8865.555 9358.852 8819.7624 9626.505 9578.885
## cvpred 7459.819 8790.137 8884.649 8730.4637 9226.472 10167.563
## price 20000.000 6195.000 6429.000 9329.0000 10500.000 7000.000
## CV residual 12540.181 -2595.137 -2455.649 598.5363 1273.528 -3167.563
## 930 935 940 941 948 1071 1072
## Predicted 10044.758 8642.136 8355.046 11088.278 10495.64 9319.252 11234.471
## cvpred 9660.069 8176.363 7863.773 12220.095 10289.44 8771.998 11164.424
## price 11335.000 18571.000 7000.000 7900.000 26474.00 10670.000 5286.000
## CV residual 1674.931 10394.637 -863.773 -4320.095 16184.56 1898.002 -5878.424
## 1075 1077 1101 1106 1107 1113
## Predicted 9071.917 9627.556 10279.477 8240.238 10208.467 9014.0032
## cvpred 8821.083 8917.832 9854.019 8068.796 10123.472 8714.8774
## price 6500.000 14500.000 7300.000 13000.000 7771.000 8000.0000
## CV residual -2321.083 5582.168 -2554.019 4931.204 -2352.472 -714.8774
## 1118 1120 1129 1134 1137 1149
## Predicted 10230.9483 8857.792 8926.660 9670.910 7723.7819 10107.746
## cvpred 10183.3799 8791.978 8749.018 9338.943 7273.0118 10200.698
## price 10410.0000 17143.000 6800.000 25300.000 7168.0000 8900.000
## CV residual 226.6201 8351.022 -1949.018 15961.057 -105.0118 -1300.698
## 1156 1158 1163 1167 1169 1171
## Predicted 9007.059 9540.898 8238.769 9665.5696 8980.618 8962.142
## cvpred 8652.906 9456.694 8063.355 9444.8165 8770.036 8761.216
## price 5800.000 5357.000 17500.000 10200.0000 6000.000 5500.000
## CV residual -2852.906 -4099.694 9436.645 755.1835 -2770.036 -3261.216
## 1178 1185 1197 1279 1281 1287
## Predicted 8903.247 10753.614 9413.135 11267.924 9469.327 9147.292
## cvpred 8811.307 10777.427 9027.770 12303.541 9176.944 9719.010
## price 6000.000 17243.000 6000.000 14167.000 15985.000 8173.000
## CV residual -2811.307 6465.573 -3027.770 1863.459 6808.056 -1546.010
## 1289 1294 1297 1298 1301 1304
## Predicted 9661.648 7638.304 12425.840 11197.59 9854.1692 11290.88
## cvpred 8859.293 7407.642 13659.731 12341.06 10580.5203 11148.48
## price 5723.000 13100.000 8970.000 6680.00 9990.0000 7999.00
## CV residual -3136.293 5692.358 -4689.731 -5661.06 -590.5203 -3149.48
## 1311 1313 1315 1316 1325 1328
## Predicted 10410.169 9057.597 9365.893 9000.166 10335.249 7783.8551
## cvpred 10296.301 8876.996 9997.854 8560.933 10186.717 7420.2405
## price 8599.000 13575.000 6520.000 7000.000 11757.000 7143.0000
## CV residual -1697.301 4698.004 -3477.854 -1560.933 1570.283 -277.2405
## 1333 1335 1337 1339 1340 1375
## Predicted 9150.373 14402.078 10905.214 9012.976 9012.976 8582.876
## cvpred 8807.474 15814.535 10601.196 8434.085 8434.085 8315.023
## price 10000.000 8400.000 19562.000 15000.000 15000.000 5314.000
## CV residual 1192.526 -7414.535 8960.804 6565.915 6565.915 -3001.023
## 1381 1390 1392 1393 1405 1406
## Predicted 8816.925 10104.207 9544.213 9079.932 9023.662 8953.325
## cvpred 8657.528 10088.511 9472.200 8661.035 8691.050 8728.569
## price 6500.000 8900.000 7500.000 5250.000 10000.000 14429.000
## CV residual -2157.528 -1188.511 -1972.200 -3411.035 1308.950 5700.431
## 1410 1412 1413 1415 1416 1417
## Predicted 10203.369 7975.8508 8799.40742 8519.017 7594.326 8804.3713
## cvpred 10071.566 7472.1503 8487.12176 7953.637 7263.941 8376.6239
## price 5980.000 8000.0000 8529.00000 11900.000 8500.000 8110.0000
## CV residual -4091.566 527.8497 41.87824 3946.363 1236.059 -266.6239
## 1419 1421 1424 1427 1428 1431
## Predicted 8482.3345 8742.448 9570.968 9505.595 7680.110 8918.1564
## cvpred 8260.8044 8481.555 9386.029 9313.050 7290.082 8747.3289
## price 8500.0000 6300.000 12826.000 15073.000 5990.000 8129.0000
## CV residual 239.1956 -2181.555 3439.971 5759.950 -1300.082 -618.3289
## 1433 1434 1435 1443 1446 1447
## Predicted 8900.639 10057.865 9450.015 9520.352 9371.954347 11405.557
## cvpred 8576.923 9897.531 9378.647 9341.128 9384.336538 11511.991
## price 15086.000 17962.000 5500.000 15514.000 9390.000000 18900.000
## CV residual 6509.077 8064.469 -3878.647 6172.872 5.663462 7388.009
## 1449 1450 1451 1462 1463 1464
## Predicted 11975.344 8929.464 9570.278 9572.348 8278.032 8158.459
## cvpred 12266.878 8597.497 9350.447 9457.193 8029.961 8093.744
## price 11050.000 5280.000 5473.000 15514.000 13800.000 6000.000
## CV residual -1216.878 -3317.497 -3877.447 6056.807 5770.039 -2093.744
## 1468 1763 1839 1841 1906 1915 1916
## Predicted 7539.4357 7081.413 7033.849 7822.818 6973.643 7061.087 7177.486
## cvpred 7365.1205 6535.664 6339.316 8344.787 7510.764 6561.459 7761.530
## price 7000.0000 10000.000 10000.000 11644.000 9999.000 8318.000 6000.000
## CV residual -365.1205 3464.336 3660.684 3299.213 2488.236 1756.541 -1761.530
## 2131 2180 2205 2220 2271 2298
## Predicted 7564.909 7325.637 6794.0742 7049.269 8127.3323 9438.107
## cvpred 7871.211 7616.351 5984.8881 6544.263 8138.1998 9427.174
## price 11644.000 5750.000 5100.0000 13900.000 9000.0000 7527.000
## CV residual 3772.789 -1866.351 -884.8881 7355.737 861.8002 -1900.174
## 2299 2309 2311 2312 2314 2315
## Predicted 8638.546 9571.0578 11432.357 9481.089 8838.160 11020.564
## cvpred 8183.703 9320.3059 11653.945 9374.522 8808.675 10858.356
## price 14000.000 10000.0000 20750.000 15656.000 10500.000 16000.000
## CV residual 5816.297 679.6941 9096.055 6281.478 1691.325 5141.644
## 2319 2338 2340 2342 2343 2344
## Predicted 8990.277 8765.977 8928.443 8970.824 9659.692 8761.52398
## cvpred 8746.209 8314.154 8785.417 8631.460 9423.052 8591.67954
## price 15000.000 6533.000 12000.000 5500.000 6500.000 8500.00000
## CV residual 6253.791 -1781.154 3214.583 -3131.460 -2923.052 -91.67954
## 2347 2370 2375 2383 2386 2389
## Predicted 9652.972 8834.576 8167.697 8798.583 10226.540 7448.2660
## cvpred 9457.762 8468.362 7837.166 8811.112 10167.056 6694.7519
## price 8000.000 5255.000 6508.000 19998.000 6499.000 5800.0000
## CV residual -1457.762 -3213.362 -1329.166 11186.888 -3668.056 -894.7519
## 2391 2394 2399 2403 2411 2413 2433
## Predicted 7466.1930 6499.52 7680.110 9678.264 8858.571 10315.35 8248.741
## cvpred 7947.3499 5043.83 7290.082 10483.106 8761.838 10138.36 8070.485
## price 7198.0000 6200.00 5990.000 8013.000 7095.000 22500.00 13338.000
## CV residual -749.3499 1156.17 -1300.082 -2470.106 -1666.838 12361.64 5267.515
## 2443 2445 2446 2456 2460 2462
## Predicted 10122.593 8968.262 9602.732 10184.427 10213.942 10412.586
## cvpred 10163.054 8625.202 9417.485 10123.845 10180.001 11294.026
## price 16894.000 13700.000 6500.000 14321.000 6971.000 8000.000
## CV residual 6730.946 5074.798 -2917.485 4197.155 -3209.001 -3294.026
## 2464 2473 2474 2487 2506 2511
## Predicted 11379.492 8234.450 12533.823 10154.312 8876.268 8456.124
## cvpred 11633.744 7981.309 13171.596 9966.384 8800.798 7911.464
## price 15941.000 6000.000 6836.000 12300.000 5300.000 6620.000
## CV residual 4307.256 -1981.309 -6335.596 2333.616 -3500.798 -1291.464
## 2515 2517 2519 2523 2527 2529
## Predicted 7562.786 9711.867 8896.213 8561.630 8223.546 10785.199
## cvpred 7329.165 9407.671 8815.059 7855.185 8096.375 10940.329
## price 5800.000 26302.000 7500.000 5001.000 6714.000 8143.000
## CV residual -1529.165 16894.329 -1315.059 -2854.185 -1382.375 -2797.329
## 2530 2531 2537 2538 2539 2541 2543
## Predicted 8798.583 11585.718 9063.239 10195.78 8857.882 10226.854 7678.954
## cvpred 8811.112 11500.239 8688.614 10202.14 8726.256 10198.014 7315.599
## price 6286.000 6929.000 5312.000 17900.00 5500.000 8631.000 8654.000
## CV residual -2525.112 -4571.239 -3376.614 7697.86 -3226.256 -1567.014 1338.401
## 2544 2547 2558 2566 2568 2574
## Predicted 10797.1066 9765.197 8410.517 8970.331 10097.3077 8292.2787
## cvpred 10891.8025 9366.773 7984.191 8731.948 10254.6664 7891.0118
## price 10721.0000 20000.000 26000.000 7357.000 10933.0000 7113.0000
## CV residual -170.8025 10633.227 18015.809 -1374.948 678.3336 -778.0118
## 2576 2621 2624 2626 2627 2628
## Predicted 8358.342 9611.925 8183.969 8969.1756 8465.407 8265.88210
## cvpred 7999.572 9454.757 8098.812 8757.4643 7883.012 7714.29213
## price 5500.000 5800.000 6200.000 8500.0000 14545.000 7700.00000
## CV residual -2499.572 -3654.757 -1898.812 -257.4643 6661.988 -14.29213
## 2629 2644 2651 2659 2781 2782
## Predicted 10236.513 8978.145 8966.702 8948.074 10170.943 9517.352
## cvpred 10174.187 8698.055 8685.483 8768.720 9967.187 9358.628
## price 7500.000 5580.000 6000.000 9500.000 7600.000 8160.000
## CV residual -2674.187 -3118.055 -2685.483 731.280 -2367.187 -1198.628
## 2783 2784 2788 2789 2798 2801
## Predicted 10377.137 8706.4825 8833.438 8975.335 8347.4383 10283.348
## cvpred 9828.819 8264.9883 8456.965 8587.778 8116.6875 10266.728
## price 11714.000 9000.0000 11650.000 7227.000 8325.0000 9000.000
## CV residual 1885.181 735.0117 3193.035 -1360.778 208.3125 -1266.728
## 2809 2810 2815 2817 2821 2884
## Predicted 9641.754 9174.040 9333.567 9717.128 9354.422 9117.109
## cvpred 9543.919 9037.856 8978.310 9478.938 8865.084 8889.248
## price 7200.000 5914.000 5098.000 6500.000 7857.000 5032.000
## CV residual -2343.919 -3123.856 -3880.310 -2978.938 -1008.084 -3857.248
## 2885 2888 2889 2907 2912 2929 2934
## Predicted 9050.0363 10132.6828 8350.153 7789.425 8420.49 8462.126 9620.826
## cvpred 8862.4796 9750.4091 8030.889 7260.667 7993.37 8096.059 8952.969
## price 7950.0000 10000.0000 9292.000 5036.000 6688.00 8000.000 10856.000
## CV residual -912.4796 249.5909 1261.111 -2224.667 -1305.37 -96.059 1903.031
## 2959 2960 2962 2963 2971 2972
## Predicted 10348.452 9481.089 8911.902 9004.344 8981.460 8276.8763
## cvpred 10012.851 9374.522 8720.940 8738.705 8713.562 8055.4778
## price 6306.000 5454.000 5317.000 6571.000 14500.000 7500.0000
## CV residual -3706.851 -3920.522 -3403.940 -2167.705 5786.438 -555.4778
## 2974 2977 2979 2981 2983 2997
## Predicted 8953.504 9574.131 8960.672 8410.517 9441.6465 10159.142
## cvpred 8597.124 9493.591 8755.775 7984.191 9539.3615 10215.458
## price 6000.000 22500.000 15000.000 6000.000 9000.0000 9053.000
## CV residual -2597.124 13006.409 6244.225 -1984.191 -539.3615 -1162.458
## 3001 3008 3009 3010 3215 3216
## Predicted 8939.661 8914.062 9486.788 8327.582 8868.073 8809.044
## cvpred 8701.308 8761.963 9527.732 8034.655 8570.317 8458.004
## price 5100.000 5500.000 6000.000 5100.000 11211.000 11423.000
## CV residual -3601.308 -3261.963 -3527.732 -2934.655 2640.683 2964.996
## 3218 3221 3222 3225 3236 3238
## Predicted 8432.146 8464.421 8445.081 8284.550 8472.9014 9065.151
## cvpred 8103.050 8301.534 8126.352 8119.359 7775.9578 8418.704
## price 6500.000 6500.000 6950.000 5143.000 7136.0000 5162.000
## CV residual -1603.050 -1801.534 -1176.352 -2976.359 -639.9578 -3256.704
## 3242 3293 3294 3307 3308 3312
## Predicted 7752.4331 9697.575 9441.646 8305.0112 8330.20707 8702.350
## cvpred 7274.6517 9318.494 9539.361 8040.4701 8014.58014 8170.897
## price 6500.0000 6200.000 6190.000 7334.0000 8000.00000 6000.000
## CV residual -774.6517 -3118.494 -3349.361 -706.4701 -14.58014 -2170.897
## 3314 3316 3317 3329 3332 3344 3365
## Predicted 8312.045 8407.578 8230.266 7648.6600 7567.571 7702.681 6956.361
## cvpred 8036.718 7973.309 8061.666 7289.5832 7350.113 7284.268 6587.597
## price 6499.000 6214.000 6000.000 7500.0000 5895.000 8300.000 8000.000
## CV residual -1537.718 -1759.309 -2061.666 210.4168 -1455.113 1015.732 1412.403
## 3371 3375 3377 3379 3438 3449
## Predicted 6910.019 8506.050 9568.253 7009.226 7552.813 7526.8378
## cvpred 6396.616 7920.782 9471.827 6607.798 7322.035 7378.0655
## price 10000.000 5785.000 7827.000 9000.000 10000.000 6562.0000
## CV residual 3603.384 -2135.782 -1644.827 2392.202 2677.965 -816.0655
##
## Sum of squares = 7723999018 Mean square = 21696626 n = 356
##
## Overall (Sum over all 356 folds)
## ms
## 16902138
Many other validation techniques and comparison charts cannot be generated automatically because of the categorical variables in the model. For our purposes, the statistics above are satisfactory for our final model.
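As a quick aid for reading the cross-validation output above, the sketch below recomputes two of the printed quantities from numbers copied out of that output. It assumes that each "CV residual" is the observed price minus the cross-validated prediction (`cvpred`), and that the fold's "Mean square" is its "Sum of squares" divided by `n`. The variable names are illustrative only and are not part of the original analysis.

```r
# Values copied from the first six columns of cross-validation output shown above
price  <- c(19800, 7500, 7714, 6600, 6000, 8000)
cvpred <- c(11382.73, 8031.6505, 7915.341, 7329.2906, 7947.294, 7917.40072)

# Assumption: CV residual = observed price - cross-validated prediction
cv_residual <- price - cvpred

# Residuals as printed in the output, for comparison
reported_residual <- c(8417.27, -531.6505, -201.341, -729.2906, -1947.294, 82.59928)
print(all.equal(cv_residual, reported_residual, tolerance = 1e-6))

# Assumption: fold mean square = fold sum of squares / number of rows in the fold
fold_ss <- 7723999018
fold_n  <- 356
fold_ms <- fold_ss / fold_n
print(round(fold_ms))  # compare with the printed "Mean square = 21696626"
```

Both checks agree with the printed values, which suggests the residuals can be read directly as prediction errors in price units. A large positive residual means the listing was priced well above what the model predicted for it.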
Again, the final model chosen to predict the price of an Airbnb listing is as follows:
We found that bedrooms and host response rate had a considerable influence on the model. However, after validating the model with amenities added, we decided to reject the inclusion of amenities in the model.
We hope this gives you a general expectation of the price you should be willing to pay for the type of listing and amenities you are looking for. Please reach out with your dream ideas, and we can help you separate must-haves from nice-to-haves to find an Airbnb that meets your budget.
We wish you the best of luck with your Airbnb listing!